Skip to content

Add environment DeviceMerge #7969

Open
gonidelis wants to merge 4 commits intoNVIDIA:mainfrom
gonidelis:merge_env
Open

Add environment DeviceMerge #7969
gonidelis wants to merge 4 commits intoNVIDIA:mainfrom
gonidelis:merge_env

Conversation

@gonidelis
Copy link
Member

@gonidelis gonidelis commented Mar 10, 2026

fixes #7543

@copy-pr-bot
Copy link
Contributor

copy-pr-bot bot commented Mar 10, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-project-automation github-project-automation bot moved this to Todo in CCCL Mar 10, 2026
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Progress in CCCL Mar 10, 2026
@gonidelis gonidelis changed the title Merge env Add environment DeviceMerge Mar 10, 2026
@gonidelis gonidelis marked this pull request as ready for review March 10, 2026 07:26
@gonidelis gonidelis requested a review from a team as a code owner March 10, 2026 07:26
@gonidelis gonidelis requested a review from elstehle March 10, 2026 07:26
@gonidelis gonidelis enabled auto-merge (squash) March 10, 2026 07:26
@cccl-authenticator-app cccl-authenticator-app bot moved this from In Progress to In Review in CCCL Mar 10, 2026
@github-actions
Copy link
Contributor

😬 CI Workflow Results

🟥 Finished in 2h 04m: Pass: 15%/249 | Total: 4d 12h | Max: 2h 03m | Hits: 88%/55533

See results here.

Comment on lines +162 to +164
::cuda::std::enable_if_t<!::cuda::std::is_same_v<KeyIteratorIn1, void*>
&& !::cuda::std::is_same_v<KeyIteratorIn1, ::cuda::std::nullptr_t>,
int> = 0>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remark: that's an interesting constraint since it guards against using the old overload with nullptr. It's fine. You need it here I guess because you cannot constraint a second template argument since it's just int64 here and a size_t from the other overload would convert to int64 well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we still get an ambiguity:

cub::DeviceMerge::MergeKeys(
    thrust::raw_pointer_cast(temp_storage.data()),  // KeyIteratorIn1 = char*
    temp_storage_bytes,                              // num_keys1 (size_t -> int64)
    keys1.begin(),                                   // KeyIteratorIn2
    static_cast<int>(keys1.size()),                  // num_keys2 (int -> int64)
    keys2.begin(),                                   // KeyIteratorOut
    static_cast<int>(keys2.size()),                  // CompareOp = int
    result.begin());                                 // EnvT = iterator

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need extra constraints

@gonidelis
Copy link
Member Author

After long discussion with @bernhardmgruber we decided not to specify any deterministic guarantees for DeviceMerge


// %PARAM% TEST_LAUNCH lid 0:1:2

#include <cuda/__execution/require.h>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might not be needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

Add env-based API for cub::DeviceMerge

2 participants